End-to-end acoustic modelling for phone recognition of young readers
Abstract
Automatic recognition systems for child speech are lagging behind those dedicated to adult speech in the race for performance. This phenomenon is due to the high acoustic and linguistic variability present in child speech, caused by children's body development, as well as the lack of available child speech data. Young readers' speech additionally displays peculiarities, such as a slow reading rate and the presence of reading mistakes, that make the task harder. This work attempts to tackle the main challenges in phone acoustic modelling for young child speech with limited data, and to improve understanding of the strengths and weaknesses of a wide selection of model architectures in this domain. We find that transfer learning techniques are highly efficient on end-to-end architectures for adult-to-child adaptation with a small amount of child speech data. Through transfer learning, a Transformer model complemented with a Connectionist Temporal Classification (CTC) objective function reaches a phone error rate of 28.1%, outperforming a state-of-the-art DNN–HMM model by 6.6% relative, as well as other end-to-end architectures by more than 8.5% relative. An analysis of the models' performance on two specific reading tasks (isolated words and sentences) is provided, showing the influence of utterance length on attention-based and CTC-based models. The Transformer+CTC model displays an ability to better detect reading mistakes made by children, which can be attributed to the CTC objective function effectively constraining the attention mechanisms to be monotonic.
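The abstract reports results as a phone error rate and as relative improvements. As a minimal sketch of how such figures are conventionally computed (the function names and the toy phone sequences are illustrative assumptions, not taken from the paper), the error rate can be obtained from the edit distance between reference and hypothesis phone sequences, and the relative reduction from two error rates:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phone sequences (lists of strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phone_error_rate(refs, hyps):
    """Error rate in percent: total edit errors over total reference phones."""
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_phones = sum(len(r) for r in refs)
    return 100.0 * total_errors / total_phones


def relative_reduction(baseline, new):
    """Relative error reduction in percent of `new` with respect to `baseline`."""
    return 100.0 * (baseline - new) / baseline


# Toy data (hypothetical): one substitution in a three-phone reference.
refs = [["k", "ae", "t"]]
hyps = [["k", "ah", "t"]]
per = phone_error_rate(refs, hyps)

# Toy numbers (hypothetical): an error rate dropping from 40.0% to 30.0%.
gain = relative_reduction(40.0, 30.0)
```

Under this convention, a figure such as "6.6% relative" describes the reduction as a fraction of the baseline error rate, not a difference in percentage points.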
Journal
Journal title: Speech Communication
Year: 2021
ISSN: 1872-7182, 0167-6393
DOI: https://doi.org/10.1016/j.specom.2021.08.003